319 research outputs found

    A fine grained heuristic to capture web navigation patterns

    In previous work we proposed a statistical model to capture user behaviour when browsing the web. The user navigation information obtained from web logs is modelled as a hypertext probabilistic grammar (HPG), which belongs to the class of regular probabilistic grammars. The set of highest-probability strings generated by the grammar corresponds to the users' preferred navigation trails. We have previously conducted experiments with a Breadth-First Search (BFS) algorithm that exhaustively computes all the strings with probability above a specified cut-point, which we call the rules. Although the algorithm’s running time varies linearly with the number of grammar states, it has the drawbacks of returning a large number of rules when the cut-point is small and a small set of very short rules when the cut-point is high. In this work, we present a new heuristic that implements an iterative deepening search wherein the set of rules is incrementally augmented by first exploring trails with high probability. A stopping parameter is provided which measures the distance between the current rule-set and its corresponding maximal set obtained by the BFS algorithm. When the stopping parameter takes the value zero the heuristic corresponds to the BFS algorithm, and as the parameter takes values closer to one the number of rules obtained decreases accordingly. Experiments were conducted with both real and synthetic data, and the results show that for a given cut-point the number of rules induced increases smoothly as the stopping criterion decreases. Therefore, by setting the value of the stopping criterion the analyst can determine the number and quality of rules to be induced; the quality of a rule is measured by both its length and probability.
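
    The exhaustive BFS step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `bfs_rules`, `start_probs`, `transitions` and `max_len` are hypothetical, the grammar is reduced to a dictionary of transition probabilities, and a rule is assumed to be a trail that cannot be extended without its probability dropping below the cut-point.

```python
from collections import deque

def bfs_rules(start_probs, transitions, cut_point, max_len=50):
    """Enumerate maximal trails whose probability is at least cut_point.

    start_probs: {page: probability of starting a trail at page}
    transitions: {page: {next_page: transition probability}}
    max_len is a safety bound (an assumption of this sketch) so that
    cycles made of probability-1 links cannot loop forever.
    """
    if cut_point <= 0:
        raise ValueError("cut_point must be positive")
    rules = []
    queue = deque(([page], p) for page, p in start_probs.items() if p >= cut_point)
    while queue:
        trail, prob = queue.popleft()
        extended = False
        if len(trail) < max_len:
            for nxt, p in transitions.get(trail[-1], {}).items():
                if prob * p >= cut_point:
                    queue.append((trail + [nxt], prob * p))
                    extended = True
        if not extended:
            # No admissible extension: the trail is reported as a rule.
            rules.append((tuple(trail), prob))
    return rules

# Toy grammar (invented for illustration only).
start_probs = {"A": 0.5, "B": 0.5}
transitions = {"A": {"B": 0.6, "C": 0.4}, "B": {"C": 1.0}, "C": {}}

for cut in (0.25, 0.4):
    print(cut, bfs_rules(start_probs, transitions, cut))
```

    On this toy grammar the lower cut-point yields the trails B→C and A→B→C, while the higher cut-point yields only A and B→C, which mirrors the drawback noted in the abstract: a high cut-point produces very short rules.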

    Real bad grammar: realistic grammatical description with grammaticality

    Sampson (this issue) argues for a concept of “realistic grammatical description” in which the distinction between grammatical and ungrammatical sentences is irrelevant. In this article I also argue for a concept of “realistic grammatical description”, but one in which a binary distinction between grammatical and ungrammatical sentences is maintained. In distinguishing between the grammatical and ungrammatical, this kind of grammar differs from that proposed by Sampson, but it does share the important property that invented sentences have no role to play, either as positive or negative evidence.

    Generating dynamic higher-order Markov models in web usage mining

    Markov models have been widely used for modelling users’ web navigation behaviour. In previous work we presented a dynamic clustering-based Markov model that accurately represents the second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes into account higher-order conditional probabilities. The method makes use of the state cloning concept together with a clustering technique to separate the navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real-world data sets. The results show that some pages require a long history to understand the users’ choice of link, while others require only a short history. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter.
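
    The comparison that motivates state cloning can be sketched as follows. This is only an illustration under stated assumptions, not the clustering method of the paper: the helper names `transition_probs` and `pages_needing_history` are hypothetical, the sessions are invented, and a page is simply flagged when some second-order conditional probability differs from the first-order one by more than a threshold.

```python
from collections import Counter, defaultdict

def transition_probs(sessions, order):
    """Estimate P(next | last `order` pages) from navigation sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(order, len(session)):
            context = tuple(session[i - order:i])
            counts[context][session[i]] += 1
    return {
        ctx: {nxt: c / sum(cnt.values()) for nxt, c in cnt.items()}
        for ctx, cnt in counts.items()
    }

def pages_needing_history(sessions, threshold):
    """Flag pages whose outgoing probabilities depend on the previous page."""
    first = transition_probs(sessions, 1)
    second = transition_probs(sessions, 2)
    flagged = set()
    for (_prev, cur), dist in second.items():
        for nxt, p2 in dist.items():
            # Any second-order observation also appears in the first-order counts.
            if abs(p2 - first[(cur,)][nxt]) > threshold:
                flagged.add(cur)
    return flagged

# Invented sessions: the link chosen on page C depends on where the user came from.
sessions = [
    ["A", "C", "X"], ["A", "C", "X"],
    ["B", "C", "Y"], ["B", "C", "Y"],
    ["A", "D", "Z"], ["B", "D", "Z"],
]
print(pages_needing_history(sessions, threshold=0.1))
```

    Here only page C is flagged: from C the first-order model gives X and Y equal probability, whereas the second-order model predicts X after A and Y after B. Raising the threshold flags fewer pages, which loosely corresponds to the abstract's point that a probability threshold controls the number of additional states.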

    From treebank resources to LFG F-structures

    We present two methods for automatically annotating treebank resources with functional structures. Both methods define systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks, or directly to constraint set encodings of treebank PS trees.

    Fuzzy Intervals for Designing Structural Signature: An Application to Graphic Symbol Recognition

    Revised selected papers from the Eighth IAPR International Workshop on Graphics RECognition (GREC) 2009. The motivation behind our work is to present a new methodology for symbol recognition. The proposed method employs a structural approach for representing visual associations in symbols and a statistical classifier for recognition. We vectorize a graphic symbol, encode its topological and geometrical information in an attributed relational graph, and compute a signature from this structural graph. We address the sensitivity of structural representations to noise by using data-adapted fuzzy intervals. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from the structural signatures of the underlying symbol set. The Bayesian network is deployed in a supervised learning scenario for recognizing query symbols. The method has been evaluated for robustness against degradations and deformations on pre-segmented 2D linear architectural and electronic symbols from GREC databases, and for its recognition abilities on symbols with context noise, i.e. cropped symbols.

    Trapping dust particles in the outer regions of protoplanetary disks

    In order to explain grain growth to mm-sized particles and their retention in the outer regions of protoplanetary disks, as observed at sub-mm and mm wavelengths, we investigate whether strong inhomogeneities in the gas density profiles can slow down excessive radial drift and help dust particles to grow. We use coagulation/fragmentation and disk-structure models to simulate the evolution of dust in a bumpy surface density profile, which we mimic with a sinusoidal disturbance. For different values of the amplitude and length scale of the bumps, we investigate the ability of this model to produce and retain large particles on million-year timescales. In addition, we compare the pressure inhomogeneities considered in this work with the pressure profiles that arise from the magnetorotational instability. Using the Common Astronomy Software Applications ALMA simulator, we study whether these pressure inhomogeneities have observational signatures that can be seen with ALMA. We present the favorable conditions for trapping dust particles and the corresponding calculations predicting the spectral slope in the mm-wavelength range, to compare with current observations. Finally, we present simulated images using different antenna configurations of ALMA at different frequencies, to show that the ring structures will be detectable at the distances of the Taurus-Auriga or Ophiuchus star-forming regions. Comment: Pages 15, Figures 14. Accepted for publication in Astronomy and Astrophysics
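
    The role of the bump amplitude can be illustrated with a toy profile. This is a rough sketch only, not the disk model of the paper: the power-law background, the cosine form of the disturbance, the constant wavelength, the arbitrary radial units, and the names `surface_density` and `local_maxima` are all assumptions made here, and local maxima of the surface density are used as a stand-in for the pressure maxima toward which dust drifts.

```python
import math

def surface_density(r, amplitude, wavelength, sigma0=1.0, r0=10.0):
    """Assumed profile: r^-1 power law times a sinusoidal disturbance."""
    background = sigma0 * (r / r0) ** -1.0
    return background * (1.0 + amplitude * math.cos(2.0 * math.pi * r / wavelength))

def local_maxima(amplitude, wavelength, r_min=10.0, r_max=100.0, step=0.01):
    """Radii on a uniform grid where the perturbed profile has a local maximum."""
    n = int(round((r_max - r_min) / step)) + 1
    radii = [r_min + i * step for i in range(n)]
    sigma = [surface_density(r, amplitude, wavelength) for r in radii]
    return [radii[i] for i in range(1, n - 1) if sigma[i - 1] < sigma[i] > sigma[i + 1]]

weak = local_maxima(amplitude=0.01, wavelength=10.0)
strong = local_maxima(amplitude=0.3, wavelength=10.0)
print(len(weak), len(strong))
```

    With these assumed numbers the weak disturbance leaves the profile monotonically decreasing, so no candidate trapping radii appear, while the stronger one produces a series of maxima. This is meant only to show why the amplitude and length scale of the bumps decide whether particles can be retained.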

    Metaheuristics for Natural Language Tagging
